Simplifying Regular Expressions: A Quantitative Perspective

Authors

  • Hermann Gruber
  • Stefan Gulan
Abstract

We consider the efficient simplification of regular expressions and suggest a quantitative comparison of heuristics for this task. We propose a new normal form for regular expressions, which outperforms previous heuristics while still being computable in linear time. We apply this normal form to determine an exact bound for the relation between the two most common size measures for regular expressions, namely alphabetic width and reverse Polish notation length. We then show that every regular expression of alphabetic width $n$ can be converted into a nondeterministic finite automaton with ε-transitions of size at most $4\frac{2}{5}n + 1$, and that this bound is optimal. This provides an exact resolution of a research problem posed by Ilie and Yu, who had obtained lower and upper bounds of $4n - 1$ and $9n - \frac{1}{2}$, respectively [L. Ilie, S. Yu: Follow automata. Inform. Comput. 186, 2003]. For reverse Polish notation length as the input size measure, an optimal bound was recently determined [S. Gulan, H. Fernau: An optimal construction of finite automata from regular expressions. In: Proc. FST&TCS, 2008]. We prove that, under mild restrictions, their construction is also optimal when alphabetic width is taken as the input size measure.
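The two size measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the usual definitions: alphabetic width counts occurrences of alphabet symbols, and reverse Polish notation length counts every node of the syntax tree (symbols and operators alike). The class and function names (`Sym`, `Star`, `Concat`, `Alt`, `alphabetic_width`, `rpn_length`) are hypothetical and chosen for illustration only; they do not come from the paper, and the constants for the empty word and the empty language are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Sym:
    """A single alphabet symbol (hypothetical AST node for this sketch)."""
    char: str


@dataclass(frozen=True)
class Star:
    """Kleene star applied to a subexpression."""
    inner: "Regex"


@dataclass(frozen=True)
class Concat:
    """Concatenation of two subexpressions."""
    left: "Regex"
    right: "Regex"


@dataclass(frozen=True)
class Alt:
    """Union (alternation) of two subexpressions."""
    left: "Regex"
    right: "Regex"


Regex = Union[Sym, Star, Concat, Alt]


def alphabetic_width(r: Regex) -> int:
    """Count occurrences of alphabet symbols; operators contribute nothing."""
    if isinstance(r, Sym):
        return 1
    if isinstance(r, Star):
        return alphabetic_width(r.inner)
    return alphabetic_width(r.left) + alphabetic_width(r.right)


def rpn_length(r: Regex) -> int:
    """Count all syntax-tree nodes, i.e. symbols plus operators."""
    if isinstance(r, Sym):
        return 1
    if isinstance(r, Star):
        return 1 + rpn_length(r.inner)
    return 1 + rpn_length(r.left) + rpn_length(r.right)


# The expression (a + b)* . a
expr = Concat(Star(Alt(Sym("a"), Sym("b"))), Sym("a"))
print(alphabetic_width(expr))  # 3 symbol occurrences: a, b, a
print(rpn_length(expr))        # 6 nodes: concat, star, union, a, b, a
```

Redundant operators such as nested stars increase the reverse Polish notation length without changing the alphabetic width, which is why simplification matters when relating the two measures.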


Similar resources

Simplifying Regular Expressions

We consider the efficient simplification of regular expressions and suggest a quantitative comparison of heuristics for simplifying regular expressions. To this end, we propose a new normal form for regular expressions, which outperforms previous heuristics while still being computable in linear time. This allows us to determine an exact bound for the relation between the two prevalent measures...


Simplifying Text Processing with Grammatically Aware Regular Expressions

In our paper we introduce Grammatically Aware Regular expression (GARE) and describe its usage using examples from moral consequences retrieval task. GARE is an extension to the regular expression concept that overcomes many of the difficulties with traditional regexp by adding Normalization (e.g., searching all grammatical forms with basic form of a verb or adjective is possible) or POS awaren...


An Introduction to the StreamQRE Language

Real-time decision making in emerging IoT applications typically relies on computing quantitative summaries of large data streams in an efficient and incremental manner. We give here an introduction to the StreamQRE language, which has recently been proposed for the purpose of simplifying the task of programming the desired logic in such stream processing applications. StreamQRE provides natura...


Pragmatic expressions in cross-linguistic perspective

This paper focuses on some pragmatic expressions that are characteristic of informal spoken English, their possible equivalents in some other languages, and their use by EFL learners from different backgrounds. These expressions, called general extenders (e.g. and stuff, or something), are shown to be different from discourse markers and to exhibit variation in form, funct...


Derivatives of Quantitative Regular Expressions

Quantitative regular expressions (QREs) have been recently proposed as a high-level declarative language for specifying complex numerical queries over data streams in a modular way. QREs have appealing theoretical properties, and each QRE can be compiled into an efficient streaming algorithm for its evaluation. In this paper, we generalize the notion of Brzozowski derivatives for classical regu...



Publication date: 2009